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Abstract 

Reconstructing a finite set of curves from an unordered set of sample 
points is a well studied topic. There has been less effort that considers how 
much better the reconstruction can be if tangential information is given as 
well. We show that if curves are separated from each other by a distance 5, 
then the sampling rate need only be 0{\/S) for error-free reconstruction. 
For the case of point data alone, 0{5) sampling is required. 



1 Introduction 

In this paper, we consider the problem of reconstructing a figure - that is, 
a family of curves {7i(i)}o...M-i from a finite set of data. More precisely, we 
assume we are given an unorganized set of points {'Pi\i=o...N-i, as well as unit 
tangents to the points {mi\i=a...N-i- Note that the tangents have no particular 
orientation; making the change fhi — > —rfii destroys no information. 

Definition 1.1 A polygonalization of a figure {7j(i)}o...A/~i is a planar graph 
r = {V, E) with the property that each vertex p € V is a point on some "fi{t), 
and each edge connects points which are adjacent samples of some curve ji . 

Our goal here is to construct an algorithm which reconstructs the polygonal- 
ization of a figure from the data defined above. An example of a polygonalization 
is given in Figure [1] 

The topic of reconstructing figures solely from point data {pi}i=o...N-i has 
been the subject of considerable attention [31 21 [HI [HI [H [IHl E] ■ This is actu- 
ally a more difficult problem, and only weaker results are possible. The main 
difficulty is the following; if the distance between two separate curves 7^ and 7^- 
is smaller than the sample spacing, then it is difficult to determine which points 
are associated to which curve. Thus, sample spacing must be 0{S), with S the 
distance between different curves. 

Tangential information makes this task easier; in essence, if two points are 
nearby (say pi and P2), but ±mi does not point (roughly) in the direction 
P2—pi, thenpi andp2 should not be connected. This fact allows us to reduce the 
sample spacing to 0{S^/^), rather than 0{S). This is to be expected by analogy 
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Curves and polygonalizations 




Figure 1: A figure and it's polygonalization, c.f. Definition 11.1 



to interpolation; knowledge of a function and its derivatives yields quadratic 
accuracy. 

We should mention at this point related work on Surfels (short for Surface 
Elements). A surfel is a point, together with information characterizing the 
tangent plane to a surface at that point (and perhaps other information such as 
texture). They have become somewhat popular in computer graphics recently, 
mainly for rendering objects characterized by point clouds [H [2l [6l \TE \ fT7| , fTS]. 

In this work, we present an algorithm which allows us to reconstruct a curve 
from {pi,rhi}i=o...N-i- We make two assumptions, under which the algorithm 
is provably correct. 

Assumption 1 We assume each curve ^i{t) = {xi(t),yi(t)) has bounded cur- 
vature: 



Vi = . . . M - 1, 



\xmym~yimit)\ 



(1.1) 



This assumption is necessary to prevent the curves from oscillating too much 
between samples. 



Assumption 2 We assume the curves ji and jj are uniformly separated from 
each other, i.e.: 

M\j,it)-j,it')\>S fori^j (1.2a) 
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Figure 2: An illustration of Assumption [2l The black arrow illustrates the 
condition (|1.2ap , while the red arrow illustrates the condition (jl.2bp . 



We also assume that different areas of the same curve are separated from each 
other: 

inf |7.W- 7.(01 ><5 (l-2b) 

lt-t'|>K™-i7r/2 

(assuming the curve "fi(t) proceeds with unit speed). 

These assumptions ensure that two distinct curves do not come too close 
together ^ and that separate regions of the same curve do not come arbitrarily 
close This is illustrated in Figure [2l 



2 The Reconstruction Algorithm 

Before we begin, we require some notation. 

Definition 2.1 For a vector v, let denote the vector v rotated clockwise by 
an angle Tr/2. 

Definition 2.2 Let d{p, q) denote the usual Euclidean metric, d{p, q) — \p — q\. 
Letd^{j},q) denote the distance in therh direction between p and q, i.e. d^{p,q) = 
\{p- q) ■ to|. 

Definition 2.3 For a point p and a curve we say that p £ ^i{t) if3t such 

that 7i (t) = p. 
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Forbidden regions 
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Figure 3: The forbidden zones, as described in Lemma [2751 The orange (darker 
region) is the forbidden zone, and the blue (hghter region) is the set of points a 
distance 7rKm~^/2 away from pi. 

2.1 The Forbidden Zone 

Before explaining the algorithm which constructs the polygonalization of a figure 
(the set of curves {ji{t)}o...M-i) from discrete data {pi, TOi}i=o...A'-i, we prove 
a basic lemma which forms the foundation of our method. We assume for the 
remainder of this section that the figure satisfies Assumption 1. 

Definition 2.4 For a point Pi, we refer to the set ^±B^^^-i{pi ± rn'^Km~^) as 
its forbidden zone, illustrated in Fig. O Here, Bj. (p) is the usual ball of radius 
r about p. 




Lemma 2.5 For every i ^ j, if Pj is in the forbidden zone of pi, then {pi,pj) 
is not an edge in T assuming that the sample spacing is less than Km~^7r/2. 

Proof. Suppose for simphcity that pi — (0, 0) and rfii = (1, 0). Now, consider a 
line r(i) of maximal curvature. The curve of maximal curvature, with Ty{t) > 
and proceeding at speed Km~^ is ''"^{t) = {Km~^ sin(t), Km~^i^ — cos(<))), while 
the curve with T^(i) < is T^{t) ~ {Km^^ sin(t), Km~^{cos{t) — 1)). 

By assumption 1, the curve 7(i) containing pi must lie between these curves 
(the near boundaries of the forbidden zone in Fig l2.5p . Thus, it is confined to 
the blue (lighter) region while its arc length is less than k„i^^tt/2. lipj is in the 
forbidden zone and ^(t) connects pi to pj, then it must do so after travelling a 
distance greater than Km^^7r/2. □ 
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Proximity-based reconstruction 




Point/Tangent-based reconstruction 




Figure 4: A naive proximity-based reconstruction algorithm (shown), or even a 
/3-crust type algorithm, will introduce edges between different curves. Knowl- 
edge of the forbidden zone allows us to remove such edges. 

In short, the extra information provided by the tangents allows us to exclude 
edges from the polygonalization if they point too far away from the tangent, 
resulting in higher fidelity (c.f Fig. 0]). 

Definition 2.6 For a point p, we define the allowed zone or allowed region 
AM by 

Mp) ^ BM)\[^±B^„,-l{p^±rniK„-')] (2.1) 
That is, (p) is the ball of radius e about p excluding the forbidden zone. 

Clearly, any edge in the polygonalization starting at p, with length shorter 
than e, must connect to another point qCz A^{p). We are now ready to describe 
the polygonalization algorithm. 



Algorithm 1 (Noise-Pree Polygonalization) 



Input: [ We assume we are given the dataset {pi , mi},-o...Af-i; the maximal 
curvature Km, and a parameter e satisfying both enm < l/\/2 and 2K,,n^^ < S. 
We assume that adjacent points on a given curve are less than a distance e 
apart, i.e. the curve is e-sampled. ] 
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1. Compute the graph G = {{Pi}i=o...N-i, E) with edge set: 
E = {{Pi.Pj) ■■ Pi e A^{pj) andpj £ A^{pi)} 



2. For each vertex pi G {pi}i=Q...N-i-' 

a. Compute the set of vertices 

Rf = {pj : {pi,Pj) € E and ± {pj - pi) ■ ■m^ > 0} 

b. Find the nearest tangential neighbors, i.e. 

rf = argmin^^p.±d^^{q,pi) 

3. Output the graph T = ({pi}i=o...iV-ij £'') with 

^^' = {(pl,^t)}U{(K,rr)} 

This graph is the polygonalization o/ {7i(t)}o..._A/-i. 

Remark 2.7 As presented, the complexity of Algorithm 2] is 0{N'^), due to 
both step 1 and step 2. (Step 2 can be slow if 0(7V) points are within the allowed 
region of some particular point). The complexity can be reduced to 0{N log N) 
using quadtrees if we assume a minimal sampling rate (see Appendix [B]) . 

The following theorem guarantees the correctness of Algorithm |4l Its proof 
is presented in the next section. 

Theorem 2.8 Suppose that: 

S > 2Kme^ (2.2a) 
where 6 is as in Assumptions^ and also 

e < (2.2b) 

Suppose also that the distance between adjacent samples in the polygonalization 
is bounded by e, i.e. the curve is e-sampled. Then graph T returned by Algorithm 
\^is the polygonalization o/ {7i(t)}o..._A/-i. 

2.2 Proof of Theorem [278] 

Lemma 2.9 Suppose i ^ j and that Assumptions^ holds. Then for all t,t', if 
(|2?2l) holds, then 

-f,{t') ^ AMt)). (2.3) 
Similarly, if i — j and \t — t'\ > then p.3p holds. 
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Proof. Fix t, and define p = ^i{t) and m = li{t)/ \li(t)\- Define L to be the 
line segment L — {p + rhKm~^ sm{6) : 9 G [— arcsin(eKm), arcsin(eK,„)]}. Tfie 
boundaries of A^{p) are given by 

p + rhum^^ am{d) ± 'm'^ftm^^{l — cos(6')). 

Now, for any q (z ji and q At{p), the distance between q and L is the normal 
distance to L. This distance is bounded by: 

d{q,L) < supKm"^ 1(1 - cos(6'))| 
e 

< supKm^^2sin^(6'/2) = 2^™"^ sin^(arcsin(eK„j)/2) (2.4) 

9 

The intermediate value theorem implies arcsin(a;) < arcsin'(C)a: = (1 — (^)^^/^x 
for some C G [0,a;]; since eK„i < 2^^/^ (by (|2.2bp ). we find that: 

arcsin(eKm) < (1 — {2^^^^)"^)^^^^ Km£ — V^KmC 
Substituting this into (|2.4p yields: 

d{q, L) < 2K„r^ ain'^{V2K,ne/2) < Kme^ (2.5) 

Thus, the normal distance between any point in A^{p) and L is O(Kme^). 

If 7j(t') ^ L + m^R, then clearly 7j(<') ^ A(:(7i(i)) so we assume ^j{t') e 
L + m-'-M. In this case, 7j(t') = p + mKm~"^ sin(f?o) + m-'-Zj for some G 
[— arcsin(eKm), arcsin(eKm)] and Zj e M. Thus, = d^±{'jj{t'), L), the 
normal distance to L. By construction, there is a unique value such that 
7i(^i) = p + TOKm""'^ sin(0o) +rn'^Zi. \zi\ then equals d^± {'-fi{t) , L) . By the 
second triangle inequality, 

drn^{lj{t'),L) = > ||Zj - Zi\ - |zi|| > 5- K,„e^ > K,„e^ 

But this implies that d{'-fj{t'),L) > d^±{'~fj{t'),L) > Hm(^ , and thus ^j{t') ^ 
AM- 

The proof when i ^ j is identical. □ 

This result shows that the graph G, computed in Step 1 of Algorithm 01 sep- 
arates different 7^ and 7^ from each other, as well as different parts of the same 
curve. Thus, after Step 1 , we are left with a graph G having edges only between 
points Pi and pj which are on the same curve 7fc, and which are separated along 
7fe by an arc length no more than KmT^T^ 12. 

We now show that G is a superset of the polygonalization F. 

Proposition 2.10 Suppose the point data {pi}i^Q,,,N-i is e- sampled, i.e. if two 
points Pi and Pj are adjacent on the curve jk, then the arc length between pi 
and Pj is bounded by e. Then G contains the polygonalization of {^i{t)}o...M-i. 
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Proof. If the distance between adjacent points pi and pj is at most e, then 
Pj £ Bg{pi). Since the segment of 7fc between pi and pj has arc length less than 
e, Pj is not in the forbidden zone of (by the same argument as in Lemma l2.5l 
Thus, Pj e A^{pi) (and vice versa), and {pi,pj) is an edge in G. □ 

We have now shown that G separates distinct curves, and that G contains 
the polygonalization Y of {7i(i)}o...jif-i- It remains to show that G — T. 

Lemma 2.11 A curve Ji(t) satisfying (jl.ip admits the local parameterization 

7,(t) = j{to) + {t- to)j'{to) + w(t)j'^{to) (2.6) 

where w'{tQ) — 0. The parameterization is valid for |i — to| < Km^^. In partic- 
ular, w{t) < f^^{Kmt) where f{z) = z/Vl + z^- 

Proof. Taylor's theorem shows the parameterization to be valid on an arbi- 
trarily small ball. All wc need to do is show that this parameterization is valid 
on a region of size Km^^. 

The parameterization breaks down when w'{t) blows up, so we need to show 
that this does not happen before t — e. Plugging this parameterization into the 
curvature bound p.ip yields: 

\w"(t)\ 



(l + w;'(t)2)3/2 

Assuming w" {t) is positive, this is a first order nonlinear differential inequal- 
ity for w'{t). We can integrate both sides (using the hyperbolic trigonometric 
substitution w{t) = sinh(6') for the left side) to obtain: 

<Kmt. (2.7) 



With /(z) defined as in the statement, then f~^(z) is singular only at z — ±1, 
and is regular before that. Solving (j2.7p for vu'{t) shows that: 

W'{t) < f-Hnmt) 

implying that u]'{t) is finite for Kmt < 1, or t < Km~^- D 



Lemma 2.12 Fix a point pi — p € {pi}i=o...N-i- Choose a tangent vector rho 
and fix an orientation, say +. Consider the set of points pj such that {p,Pj) is 
an edge in G and {pj — p) • mo > 0. Suppose also that e satisfies (j2.2bp . Then, 
the only edge in the polygonalization of 7 is the edge for which (pj — p) ■ toq is 
minimal. 

Proof. By Lemma [2 .111 the curve ^[t) can be locally parameterized as a graph 
near p, i.e. (|2.6p . This is valid up to a distance by (j2.2bp . it is vahd for 

all points in the graph G connected to p. 
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Point and Tangent data 




-0.5 0.0 0.5 1.0 

Reconstructed curves 




Figure 5: Some unordered points/tangents, and the curves reconstructed from 
them. In this case, e = 0.065, Km = 3 and S = 0.015. 

The adjacent points on the graph are the ones for which |t — to| is minimaL 
Note that mo ■ (Pj ~P) — t (simply plug in (|2.6p ): thus, minimizing toq • {pj —p) 
selects the adjacent point on the graph. □ 

The minimal edge is the edge rj| as computed in Step (2b) of Algorithm 21 
Thus, we have shown that the computed graph G is the polygonalization F of 
{li{t)}o...M~i- 



3 Reconstruction in the Presence of Noise 

In practice one rarely has perfect data, so it is important to understand the per- 
formance of the approach in the presence of errors. To that end, we consider the 
polygonalization problem, but with the point data perturbed by noise smaller 
than C and the tangent data perturbed by noise smaller than ^. By this we mean 
the following; to each point pi £ {pi\i=Q...N-i, there exists a point p^^* = 7^. {ti) 
such that \pi — < C- Similarly, the unit tangent vector rfii differs from the 
true tangent m^.* = I'kX^i) by an angle at most ^. By a polygonalization of 
the noisy data, we mean that (pi,pj) is an edge in the noisy polygonalization 
if (pi^<,,pj^ir) is an edge in the noise-free polygonalization. In what follows, pj 
refers to a given (noisy) point, while pj^^ refers to the corresponding true point 
(and similarly for tangents). 

Noise, of course, introduces a lower limit on the features we can resolve. At 
the very least, the curves must be separated by a distance greater than or equal 
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to C, to prevent noise from actually moving a sample from one curve to another. 
In addition, noise in the tangent data introduces uncertainty which forces us to 
increase the sampling rate; in particular, we require 0(e^ + e^) < 6. 

The main idea in extending Algorithm 0] to the noisy case is to expand the 
allowed regions to encompass all possible points and tangents. Of course, this 
imposes new constraints on the separation between curves. 

We also require a maximal sampling rate in order to ensure that the order 
of points on the curve is not affected by noise. For work in the context of 
reconstruction using point samples only, see [71 [TB] . 

Assumption 3 We assume that adjacent points pi and pj on the curve ^k{t) 
are separated by a distance greater than [(1 + 2^/^)(2^e + ()]. 

To compensate for noise, we expand the allowed region to account for un- 
certainty concerning the actual point locations. 

Definition 3.1 The noisy allowed region A^'^{pi) is the union of the allowed 
regions of all points/tangents near {pi,rni): 

Ai'Hp^)= U {BM\[^±B.^-iip±rn^K„r')]) (3.1) 

|p-p.|<C 

arccos(mi -m) <^ 



Algorithm 2 (Noisy Polygonalization) 



Input: [ We assume we are given the dataset {pi, mi}i=o...JV-i, the maximal 
curvature Km, the noise amplitudes C, ^, and a parameter e satisfying both eKm < 
I/V2 and 4C + 2e^ + 2.1 Km(''^ < We assume that adjacent points on a given 
curve are less than a distance e apart, i.e. the curve is e-sampled. ] 

1. Compute the graph G — ({pi}i=o...iV~i5 ^) with edge set: 

E = {{p,,p,) : B^[p,) n ^ and B^[p,) H A^^^ip,) ^ 0} (3.2) 

2. For each vertex pi € {pi}i=o...N~i-' 

a. Compute the set of vertices 

Rf = {pj : {pi,Pj) e E and ± {pj - p*,) • > 0} 

b. Find the nearest tangential neighbors, i.e. 

rf = argmin^^j^±d^^{q,pi) 
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Input data 
1.2F ' ' ' 




-0.2 b , I , , , : 

0.0 0.5 1.0 1.5 2.0 2.5 3.0 

Reconstructed curve 

1.2F ' ' ' ' ' - 




-0.2l , , , , , : 

0.0 0.5 1.0 1.5 2.0 2.5 3.0 



Figure 6: Noisy sampled points, and the reconstruction by Algorithm [2l This 
example takes = 5, e = 0.15, ( = = 0.01. 

3. Output the graph T — i{pi}i=Q...N-iT E') with 

E'^m,7t)}um.,fr)} 

This graph is the polygonalization o/ {7i(t)}o...j\/-i . 

The following theorem guarantees that Algorithm[2]works. The proof follows 
that of Theorem 12.81 and is given in Appendix [A] An application is shown in 
Fig. El 

Theorem 3.2 Suppose that Assumptions[^\^ and\^hold. Suppose also that 

^ > 4C + 4ee + 2.lK„e^ (3.3a) 

e < . (3.3b) 

Then, Algorithm\^ correctly reconstructs the figure. 

Remark 3.3 Consider a point p, which is a noisy sample from some curve in 
the figure. All we can say a priori is that p is close to the true sample p*, i.e. 
p g Bq(p^). However, given the knowledge that the polygonalization contains 
the edges (g, p) and (p, r) , we can obtain further information on p^ . Not only 
does lie in Bq{p), but € A\-'^[q) and p^ e v4^'^(r). In short, 

p>. eB^{p)n Ai'^iq) n A^/{f) (3.4) 
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We can therefore improve our approximation to by minimizing cither the 
worst case error, 

p"""" = argminp-sup \p- x| , A = Bc^{p)C^ n A«^«(r) (3.5a) 

or the mean error, 

p""*" = argmin- / \p - x\ dx (3.5b) 

J A 

or some appHcation-dependent functional. Noise in the tangential data can be 
similarly reduced. This is a postprocessing matter after polygonalization, and 
we will not expanded further on this idea in the present paper. 

4 Examples 

4.1 Extracting Topology from MRI images 

In its simplest version, Magnetic Resonance Imaging (MRI) is used to obtain 
the two-dimensional Fourier transform of the proton density in a planar cross- 
section through the patient's body. That is, if p{x) is is the proton density 
distribution in the plane P, then the MRI device is able to return the data p{k) 
at a selection of points k in the Fourier transform domain (fc-space). The number 
of sample points available, however, is finite and covers only the low-frequency 
range in /c-space well. Thus, it is desirable to be able to make use of the limited 
information in an optimal fashion. We are currently exploring methods for 
MRI based on exploiting the assumption that p{x) is piecewise smooth (since 
different tissues have different densities, and the tissues boundaries tend to be 
sharp). Our goal is to carry out reconstruction in three steps. First, we find the 
tissue boundaries (the discontinu- ities). Second, we subtract the influence of 
the discontonuities from the measured fc-space data and third, we reconstruct 
the remainder which is now smooth (or smoother). Standard filtered Discrete 
Fourier Transforms are easily able to reconstruct the remainder, so the basic 
problem is that of reconstructing the edges. 

Using directional edge detectors on the fc-space data, we can extract a set 
of point samples from the edges, together with non-oriented normal directions. 
By means of Algorithm ^ we can reconstruct the topology of the edge set and 
carry out the procedure sketched out above. The details of the algorithm are 
beyond the scope of this article, and will be reported at a later date, but Figure 
[7| illustrates the idea behind the method. Our work on curve reconstruction 
was, in fact, motivated by this application. 

4.2 Figure detection 

A natural problem in various computer vision applications is that of recognizing 
sampled objects that are partially obscured by a complex foreground. As a 
model of this problem, we constructed an (oval) figure, and obscured it by 
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Figure 7: A simulated MRI image. The original image was two circles, together 
with some low frequency "texture". The noise level is 5%. 



covering it with a sequence of curves. Algorithm |4] succesfuUy reconstructs the 
figure, as well as properly connecting points on the horizontally and vertically 
oriented covering curves. The result is shown in Figure[8l Note that the branches 
are not connected to the oval (or each other). 

4.3 Filtering spurious points 

The method provided here is relatively robust with regard to the addition of 
spurious random data points. This is because spurious data points are highly 
unlikely to be connected to any other points in the polygonalization graph. To 
see this, note first that for an incorrect data point to be connected to part of 
the polygonalization at all, it would need to be located in At{p) for some p. 
This is a region of length 0(e) and width O(e^). There are approximately L = 
J2j arclength(7j) such points, for a total volume of e^L. Thus, the probability 
that a spurious point is in some allowed region is roughly 0{Le^). 

The second reason is that even if a spurious point is in some allowed region, 
it is unlikely to point in the correct direction. If an erroneous point q is inside 
Af:(p), it is still not likely that p € A^{q), since the tangent at q must point in the 
direction of p (with error proportional to e^, the angular width of Aii{q)). Thus, 
the probability that the tangent at q points towards p is 0{e^ /2tt). Combining 
these arguments, the probability that any randomly chosen spurious point q is 
connected to any other point in the polygonalization is 0{Le'^). 
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Figure 8: A figure which is partially obscured. Algorithm |3] correctly computes 
its polygonalization, and distinguishes it from the curves in front of it. To avoid 
visual clutter, the tangents are not displayed in this figure. 
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4.3.1 Filtering the data 

The aforementioned criteria suggest that our reconstruction algorithm has ex- 
cellent potential for noise removal. It suggests that if we remove points which 
do not have edges pointing towards other edges, then with high probability we 
are removing spurious edges. 

This notion is well supported in practice. By running Algorithm [4] on a 
figure consisting of 96 true points, and 100 randomly placed incorrect points, 
a nearly correct polygonalization is calculated (Fig. The original curve is 
reconstructed with an error at only one point (the top left corner of the right- 
hand curve). 

Of course, if enough incorrect points are present, some points will even- 
tually be connected by Algorithm [H This can be seen in Figure |9j the line 
segment near (0.9, —0.2) is an edge between two incorrect points. One hint 
that an edge is incorrect is that it points to a leaf. That is, consider a set of 
vertices pQ,pi, . . . ,pn as well as q. Suppose, after approximately computing the 
polygonalization, one finds that the graph contains edges cq — {po,pi),ei = 
iPi,P2), ■ • ■ , e„-i = {pn-i,Pn) and e„ = {pn/2,q)- The vertex g is a leaf, that 
is it is reachable by only one edge. A polygonalization of a set of closed curves 
should not have leaves, suggesting that the edge e„ is spurious. Thus filtering 
leaves is a very reasonable heuristic for noise filtering. 

One final problem with noisy data worth mentioning is that sometimes, an 
incorrect point will be present that lies within the allowed region of a legitimate 
point, and closer to the legitimate point than the adjacent points along the curve. 
This will prevent the correct edge from being added. This can be remedied 
by adding not only ff at Step 3 of the algorithm, but also points for which 
d^±{pi) whose distance to pi is not much longer than the distance between pi 
and ff. With some luck, this procedure combined with filtering out leaves will 
approximately reconstruct the correct figure. 



Algorithm 3 (PolygonaUzation with Noise Removal) 



Input: 

/ We assume we are given the dataset {pi, ?7ii}i=o...A'-i (which includes spuri- 
ous data), the maximal curvature Km, the noise amplitudes ^, and a parameter 
e satisfying both eKm < 1/a/2 and 2nme^ < S. We assume that adjacent points 
on a given curve are less than a distance e apart, i.e. the curve is e-sampled. 
We also assume we are given the number of leaf removal sweeps I G Z+ and a 
threshold a > I. ] 

1. Compute the graph G = {{pi\i=Q...N-i,E) with edge set: 

E = {{Pi.Pj) ■ Pi e ^e(Pj) and pi G A^{pj)} 

2. For each vertex pi £ {pi}i=o...Af-i-' 
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a. Compute the set of vertices 

^ {Pj ■ iPi^Vj) e E and ± {pj - pi) ■ m, > 0} 

b. Find the nearest tangential neighbors, i.e. 

jf = argmin^^^± ± {pj - p,) ■ rfi^ 

c. Find the set of almost-nearest tangential neighbors: 

Rf = {f e Rf : d^M, ^ < arf} 
3. Compute the graph T — {{pi}i=o...N-i, E') with 

E' = \JW.,r^ ■■ r e R+} U {(k, r) : r e R"} 

i 

4- Search through T for leaves, and remove edges pointing to the leaves. Re- 
peat this I times. 

5. Output r. 

In practice, we have found that a = l.l and I = 4 work reasonably well. 
Figure [TPl illustrates the result of Algorithm [31 both with and without filtering. 

5 Conclusions 

Standard methods for reconstructing a finite set of curves from sample data are 
quite general. By and large, they assume that only point samples are given. In 
some applications, however, additional information is available. In this paper, 
we have shown that if both sample location and tangent information are given, 
significant improvements can be made in accuracy. We were motivated by a 
problem in medical imaging, but believe that the methods developed here will 
be of use in a variety of other applications, including MR tractography and 
contour line reconstruction in topographic maps [131 [5] . 

A Proof of Theorem K2\ 

The proof of Theorem 13.21 follows that of Theorem 12.81 closelv. with minor mod- 
ifications made to account for the noise. To begin, we need to show that the 
noisy allowed region is large enough to separate distinct curves. It is here that 
we use p.3a|) . 

Proposition A.l Suppose i j and assume that (j3.3[) holds. Then B(^(pi) D 
A^'^{pj) = unless Pi andpj are samples from the same curve, and are separated 
by an arc length no larger than Km~"'^7r/2. 
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Point and Tangent data 




Figure 10: The same example as in Figure [51 but with 2000 additional random 
points added (for a total of 2096). The original curve is no longer completely 
reconstructed, but the general shape is still roughly visible, along with many 
more spurious points. The middle figure shows the reconstruction without Step 
4 of Algorithm 3. Filtering leaves with I = 4 improves the situation considerably 
(bottom figure). 
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Proof. For simplicity, suppose that pi £ B^{pj) (since otherwise, pi ^ 

but Pi is not sampled from the same curve as pj. Let 7(t) denote the curve from 

which Pj is sampled. Let p = pj and m = rrij to simplify notation. 

Select points p",rn" to minimize d{pi,L"), where L" ^ p" + fh" , with the 
constraint that d{p" ,p) < and the angle between m" and m is smaller than ^. 
Let a G L" be the point for which d{d,pi) — d{pi, L"). 

It is shown in the proof of Lemma [2.91 that if d{x.L") > Km^"^, then vx ^ 
A^{p") (recall (EH), ^M)- Thus, if d{p^,L") > n^e"^ + C for any p",m", then 
X ^ A^{p") for each x e BQ{pi) and hence pi ^ v4^'^(p). We will show this to be 
the case. 

By the second triangle inequality, we have the bound: 

d{pi,L") = d{p^,d) > d{p'i,d) - d{pi,p'i} 

> d{p'i, b) - d{b, a) ~ d{p„p'i} > (5 - d{b, a) - C (A.l) 

where b is the point on 7(i) closest to p^. Once we show this is greater than 
Km^^, the proof is complete. 

Let L' = -\- m't (with jf and m! being true samples of 7(i), approximated 
by p and vm) . Then we have the bound: 

d{b, d) < sup d(6, g) + d{g, a) < d{b, L') + d{d, L') < n^e^ + d{d, L') (A.2) 

The bound on d(6, g) follows since 6 is a sample from 7(<) (recalling (|2.4p . (|2.5p ). 
Since d = p" + m" s (for some s € [— e, e]), we can perform the bound: 

d{d, L') — sup d^djjf + m't) < sup sup d(p" + m" s^j/ + m't) 

t|<£ |s|<£|t|<C 

< d{p" + m"e, p' + m'e) < 2C + 4^6 (A.3) 

In (|A.3|1 . we assume m' and m" are oriented the same way. It is easy enough to 
see that the sup is achieved at the endpoints; we then use the triangle inequality 
d{p",vp') < d{p",p) + d{p,p') < 2(, and similarly for the tangents. Thus, we 
find that: 

(IX2|) < K„,e2 + 2C + 4e^ (A.4) 
Plugging this into (jA.ip shows that: 

d{p^, L") > (5 - (4C + 4ee + K^e") > LIk^e^ > ^^^2 ^ ^ (^ 5) 

where the last inequality follows from (|3.3ap . □ 

This shows that the graph G computed in Step 1 separates distinct curves. 
The next result parallels Proposition l2.101 and shows that the noisy allowed 
region contains nearby points on the polygonalization. 

Proposition A.2 Suppose the figure is sampled at a rate satisfying (j2.2bp . 
Then G contains the polygonalization of the figure. 
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Proof. The point pi and tangent are close to some point p^, rh[ on the figure 
{li{t)}o...M-i', in particular, \pi — < C ^nd arccos(mi • rn'j) < ^ . Similarly, 
there is a point on the figure a distance no more than ^ away from pj. By 
Proposition [Uni p'j G Since p'^ € S^(Pj) and G C A^'^p,), 

we find that G B(^{pj) n A^'^(pi) 7^ 0. Repeating this argument with i and j 
interchanged shows that p.2p holds, and {pi,Pj) is an edge of G. □ 

Proposition A. 3 Fis a point pi = p G {pi}i=o...N-i, o,nd suppose that As- 
sumption\^ holds. Choose a tangent vector ifiQ and fix an orientation. Consider 
the set of points Pj such that {p,Pj) is an edge in G (as per Step 1 of Algorithm 
and [pj ~ p) ■ Too > 0. Suppose also that e satisfies (j3.3bp . 
Then the nearest tangential neighbor ofp (i.e. the edge for which (pj ~p)-rno 
is minimal) is the edge in the polygonalization ofj. 

Proof. The idea of the proof follows that of Lemma [2.121 closely, but we must 
adjust for our uncertainty as to the point and tangent. 

The curve itself has the parameterization ^{t) = ff + rn't + rn'^w{t), by 
Lemma 12.111 and this is valid for \t\ < Km~^ ■ However, we do not know 
and to', only p and m. We wish to find the point pj for which to' • (p^ — f?) is 
minimal, and we approximate this by finding the point for which m • {pj — p) is, 
minimal. 

Using the fact that rn ■ {pj ^ p) — jn ■ (pk — p) = rfi ■ {pj — Pk), we find that 

rh ■ {pj - p) ~ rn ■ {pk - p) ^ rn ■ {pj - pk) = 

TO ■{p'^-p'k)+rn- {[pj - p'j] - [pk - p'k]) 
= m ■ {pj - p^) + {rh~ to) • {pj - pfc) +rn- {[pj - p^] - [pk - Pk]) (A.6) 

The second and third terms on the right side of (jA.6P are the error terms. We 
have the bound: 

|(m - to') • (p'j - p'^) + rn ■ {[pj - p'j\ - [pu - p'fe])| 

< ^sin2(e) + (l-cos(e))2 Ip^. - p1 I + 2C < 2(^6 + C) 

We wish to find the j for which (IA.6P is negative for every k. If we can show 
that |to' ■ [fj — P^k)\ > 2(^6 + C)j we are done. 

If we observe that p'j = + m'tj + w{tj)rn'-^ (using the notation of Lemma 
12. lip , and similarly p^ = ff + jri'tk+w{tk)m'^ , we find then that to'-(p^ — p^) = 
tj tk . 

It is here we use the fact that \pj — Pk\ = {tj — tkY + (w{tj) — w{tk)Y ^ 
[(1 + 23/2)(2^e + ()]2. With f{z) as in Lemma[21Il wc find that: 

\{w{t,) - w{tk))\ = \w'{y)\ \{tj ~tu)\< j,(^f-i^^^^y)^ \ih - tk)\ 

= il + {f-\^my)rr/'\it,-t,)\ 
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for some y € [0, If k„j?/ < l/\/2 (i.e. (j2.2bp is satisfied), then / ^{nmV) < 

1 and (1 + f-^(Kmvf) < 2. Thus: 

> {t, - f + [(1 + (/-i(^„y))2)-V2(i^. _ 

> |p;--p;.|'> [(l + 23/2)(2Ce + C)P (A.7) 
(the last inequality follows from Assumption [3]) implying that \tj — tk\ > 2(^e + 

C). □ 
B Speeding it up: an 0{N log N) algorithm 

As remarked earlier, Algorithm [4] and [2] run in O(iV^) time as written. The slow 
step is Step 1 which involves comparing every point/tangent pair to every other 
such pair. This scaling issue can be remedied by using a spatially adaptive data 
structure [12] 

A caveat: there are two different ways of increasing N. The first (increasing 
outward) is by taking larger figures, with the sampling rate held fixed. The 
second (increasing inward) is by holding the figure size fixed, but increasing the 
sampling rate. We are interested primarily in the first case, and we will treat 
this case only. Therefore, we make the following additional assumption: 

Assumption 4 We assume that the density of points in the input data is 
bounded above, i.e.: 

\{pi}i=o...N-i r\B\ 
sup < Pm (B.l) 

Note that this always holds in the case of noisy data. In this case, Assump- 
tion [3] combined with (|3.3p implies that 

1 1 

Orr, < ^rr— < 



(1 + 23/2)(2e + OS " (1 + 23/2)(2e + C)(4C + + ^.In^e^) ' 

In computing Step 1 of Algorithm [4] or [2l we must determine whether two 
points are in each other's allowed region (or a ball of radius C about the noisy 
allowed region). Note that Ae{pi) C Be{pi), so if \pi — Pj\ > e, then clearly 
the edge {pi,Pj) ^ G. Similarly, for the noisy case, if \pi — Pj\ > e + 2^, then 
{PijPj) ^ G. We exploit this fact by using quadtrees, which allow us to avoid 
comparing points more than a distance e apart. 



Algorithm 4 (Fast Computation of the Graph G) 



20 



Input: [ We assume we are given the dataset {pi, mi}i=o...JV-i7 the maximal 
point density p„i and the sampling e. We also take the parameter X ~ e (noise 
free case) or X — e + (noisy case). ] 

1. Compute a quadtree Q storing (pi^rhi) pairs. The splitting criteria for a 
node is when the node contains more than pmX^ points. 

2. Initialize the graph G = {{pi\i=o...N-i, E) with empty edge set. 

3. For each point pi, iterate over the points pj contained in the node contain- 
ing Pi and all of its nearest neighbors. If 

Pi e A^ipj) and pj e A^{pi), 

then add the edge {pi^Pj) to the graph G. 

4-. Return G. 

Initializing the quadtree in step 1 is an O(A^logA^) operation. Assumption 
m implies that the width of a node will be no smaller than A; thus, a node 
containing a point pi together with it's nearest neighbors contains the allowed 
region. The comparison at step 3 involves at most PmX"^ points, regardless of 
N. Thus, the complexity of this algorithm is 

0{N\ogiN)+NpraX''). (B.2) 
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